# 23 On functions.

Date modified: Wed 2024-07-31 . 12:10 PM
Date created: Wed 2024-07-31 . 12:10 PM
# 23 On functions. We are quite comfortable with the idea of a function, a typical mental model is that it is a "machine" that turns an input into one output. The key feature of a function is: **For each input, the function yields one output**. This is a very good mental model that works, and we like to add a few more language to it. Before we start, we remark the words **map** and **transformation** both just mean function. ## Domain, codomain, range, and well-definiteness. A function consists of four pieces of information, its **name**, its **domain**, its **codomain**, and its **rule**. We write $$ f:A\to B $$to mean here we have a function $f$, whose **domain** is the set $A$, and whose codomain is the set $B$. We can read "$f:A\to B$" as "$f$ is a function from $A$ to $B$". We sometimes write $\text{dom}(f)$ and $\text{codom}(f)$ to denote the domain and codomain of the function $f$. We also use the following diagram to express $f:A\to B$, like so $$ A\xrightarrow{f}B. $$ We can also visualize it in this diagram ![[smc-spring-2024-math-13/linear-algebra-notes/---files/23-on-functions 2024-03-22 09.38.45.excalidraw.svg]] %%[[smc-spring-2024-math-13/linear-algebra-notes/---files/23-on-functions 2024-03-22 09.38.45.excalidraw.md|🖋 Edit in Excalidraw]]%% To specify its rule, we would write $$ f(x)= \cdots $$ or $$ x \mapsto \cdots. $$The arrow with a vertical bar $\mapsto$ is read as "maps to". By the way, not every function has a rule that can be easily described by a "formula" -- sometimes the rule is the result of a procedure or algorithm. Regardless of the rule or procedure, the result of the function on an input must be unique. For example, here is a familiar function from calculus $$ f:\mathbb{R} \to \mathbb{R}, x\mapsto x^{2}. $$This is sometimes just written $f(x)=x^{2}$ in your previous life time, but it is vague without specifying its domain and codomain, we will see in a bit why. Here is another function that we have encountered. For any fixed positive integers $n,k$, $$ \begin{align*} \text{rref}: M_{n\times k}(\mathbb{R}) &\to M_{n\times k}(\mathbb{R})\\ A&\mapsto \begin{array}{l} \text{the reduced row } \\ \text{echelon form of $A$.} \end{array} \end{align*} $$ is a well-defined function. Indeed, for any $n\times k$ matrix $A$, there is precisely one RREF we can write that comes form $A$. Note the rule of the $\text{rref}$ function takes some work to describe, as to obtain the RREF of a matrix we have to perform a procedure. And it actually requires a proof (which I didn't show, but we just believed) to show it is indeed a function -- that we have a unique well-defined RREF of a matrix. On the other hand, the following is **not** a function: $$ \begin{align*} \text{ef}: M_{n\times k}(\mathbb{R}) &\to M_{n\times k}(\mathbb{R})\\ A&\mapsto\text{an echelon form of $A$ } . \end{align*} $$Indeed, if we are to find "an echelon form of $A$", there are many possibilities! Since there is no one single output per input, this is not a function. Sometimes we say this "function" is **not well-defined**. Now, the **domain** of a function $f$ is the set of all permissible inputs that you are considering. That is to say, your rule needs to be defined for each $x \in\text{dom}(f)$. Now, the domain of a function that you should specify when you describe your function. Quite often in your previous lifetime you are asked to "find the domain of $f$", which honestly is a bit nonsensical since domain should be supplied with the function. However, it is also not entirely nonsensical, since it really means "give the largest domain you can have for the function $f$, within reason", when a domain is not specified. The **codomain** of a function $f$ is just any set where it contains all the outputs of $f$, while the **range** (or some times called the **image**) of a function $f$ is the set of actual outputs of $f$ on each input from the domain of $f$. A, $$ \text{range}(f) = \{f(x):x\in \text{dom}(f)\}. $$Sometimes for a function $f:A\to B$, we write $f(A)$ to denote the range of $f$ as well. Again we emphasize, $$ \text{range}(f) \subset \text{codom}(f). $$ And here is a diagram that helps you visualize ![[smc-spring-2024-math-13/linear-algebra-notes/---files/23-on-functions 2024-03-22 09.46.36.excalidraw.svg]] %%[[smc-spring-2024-math-13/linear-algebra-notes/---files/23-on-functions 2024-03-22 09.46.36.excalidraw.md|🖋 Edit in Excalidraw]]%% For example, let us consider the function $$ f:\mathbb{R}\to \mathbb{R},\ x\mapsto x^{2} $$The name of this function is $f$, the domain is $\mathbb{R}$, the codomain is also $\mathbb{R}$, the rule is $x\mapsto x^{2}$. The range of this function however is the real interval $[0,\infty)$. So the codomain need not equal to the range! But the codomain needs to be a set that contains the range for the function to be well-defined. ## Image and pre-image under a function. Some more jargons and terminologies. Consider a function $f:A\to B$, for a given $x \in A$, we say $f(x)$ is the **image** of the point $x$ under $f$. We also say $f(x)$ is the **value** of $x$ under $f$. Now, just as how $f(A)$ is the **range** of $f$, if $S\subset A$ , then we define $f(S)$ to be the image of the subset $S$ under $f$, and it consists of all outputs of $f$ on each input from $S$. That is, $$ f(S) = \{f(x):x\in S\}. $$ Of course, we have $f(S)\subset\text{range}(f)\subset\text{codom}(f)$. Diagrammatically, ![[1 teaching/smc-spring-2024-math-13/linear-algebra-notes/---files/23-on-functions 2024-03-22 09.57.25.excalidraw.svg]] %%[[1 teaching/smc-spring-2024-math-13/linear-algebra-notes/---files/23-on-functions 2024-03-22 09.57.25.excalidraw|🖋 Edit in Excalidraw]]%% Now, consider a function $f:A\to B$. Let us fix a point $b \in B$ from the codomain. We denote the **pre-image** of $b$ to be the set of all inputs $x$ from $A$ such that $f(x)=b$, namely $$ f^{-1}(b) = \{x\in A : f(x) = b\} $$ Diagrammatically we have ![[smc-spring-2024-math-13/linear-algebra-notes/---files/23-on-functions 2024-03-22 10.03.01.excalidraw.svg]] %%[[smc-spring-2024-math-13/linear-algebra-notes/---files/23-on-functions 2024-03-22 10.03.01.excalidraw.md|🖋 Edit in Excalidraw]]%% **CAUTION.** This is an **unfortunate** notation, because it looks like "inverse", but it's not necessarily the inverse function, as we will see that the inverse of a function need not always exist, while the pre-image of a point in the codomain is always a set. It's about the set of all inputs that has image $b$ under $f$. The result of $f^{-1}(b)$ is a set of points. To avoid confusion with "inverse", sometimes we write $$ f^{\leftarrow}(b) \quad \text{to mean } \quad f^{-1}(b) $$to mean the pre-image of $b$. We can also speak of the pre-image of a set. Given a function $f:A\to B$, if $U\subset B$, then we say the **pre-image of the set $U$ under the map $f$** is the set of all inputs whose image is in $U$, and denote it as $f^{-1}(U)$. That is $$ f^{-1}(U) = \{x\in A:f(x)\in U\}. $$Again to avoid confusion with the inverse function (which again may or may not even exist!), you can write $$ f^{\leftarrow}(U)\quad \text{to mean}\quad f^{-1}(U). $$ And diagrammatically we have ![[smc-spring-2024-math-13/linear-algebra-notes/---files/23-on-functions 2024-03-22 10.09.12.excalidraw.svg]] %%[[smc-spring-2024-math-13/linear-algebra-notes/---files/23-on-functions 2024-03-22 10.09.12.excalidraw.md|🖋 Edit in Excalidraw]]%% **Example.** Consider again $f:\mathbb{R}\to \mathbb{R}$ with $f(x)=x^{2}$. Then $f^{-1}(1)=\{1,-1\}$, $f^{-1}(0)=\{0\}$, and $f^{-1}(-1)=\varnothing$. And if we have the set $[1,2]$, then $f^{-1}([1,2])=[-\sqrt{2},-1]\cup[1,\sqrt{2}]$. Draw a picture to help you see this. ## Surjectivity. As remarked earlier, the range and the codomain of a function $f$ need not be the same, we merely need $\text{range(f)}\subset \text{codom}(f)$. However, when they are equal then we have a special situation: > Definition. We say a function $f:A \to B$ is **surjective** if its range equal to its codomain, that is $$ \text{range}(f) = B. $$ We also use the adjective **onto** to describe a surjective function. If a function is surjective, we say it is a **surjection**. Diagrammatically, ![[smc-spring-2024-math-13/linear-algebra-notes/---files/23-on-functions 2024-03-22 10.17.05.excalidraw.svg]] %%[[smc-spring-2024-math-13/linear-algebra-notes/---files/23-on-functions 2024-03-22 10.17.05.excalidraw.md|🖋 Edit in Excalidraw]]%% **Example.** The function $f:\mathbb{R} \to \mathbb{R}$ where $f(x)=x^{2}$ is not surjective, as $\text{range}(f)=[0,\infty) \neq \text{codom}(f)=\mathbb{R}$. However the function $g:\mathbb{R}\to [0,\infty)$ where $g(x)=x^{2}$ is indeed surjective. Note, the general question of whether a function is surjective or not is not that easy to determine, this is due to the range of a function is not necessarily easy to determine. We often need to invoke many tools to determine it. In calculus, we developed a set of tools that might help us, such as limits and derivatives. Let me illustrate a full example of it (briefly read the example, don't get scared): **Example.** Is the function $f: (0, \infty)\to \mathbb{R}$ given by $f(x) = \frac{\ln(x)}{x^{2}}$ surjective? What is the range of $f$? From calculus, we see that the limit $$ \lim_{x\to 0^{+}} f(x)=-\infty $$and $$ \lim_{x\to\infty} f(x) = 0 $$Then with the fact that $f$ is continuous, we can show $f$ is bounded above by some number and hence its range cannot be all of $\mathbb{R}$. So $f$ is not surjective. But to show this it requires a bit work. It requires an argument in **analysis**, which goes as follows: Since $\lim_{x\to 0^{-}}f(x)=-\infty$, this means there exists some $a > 0$ such that when $0< x < a$ we have $f(x) < 0$. And as $\lim_{x\to \infty} f(x) = 0$, it implies there exists some $b$ such that $x > b$ implies $f(x) < 1$. These conditions are derived from the definitions of these limits. And finally, on the closed interval $[a,b]$, by **extreme value theorem**, as $f$ is continuous on $[a,b]$, $f$ has a maximum over $[a,b]$, that is $f(x) \le M$ for all $x\in [a,b]$, for some $M$. Putting all together, $f(x) < 0$ on the interval $(0,a)$, $f(x) < M$ on the interval $[a,b]$, and $f(x) < 1$ on the interval $(b,\infty)$, we see that $f$ is bounded above, and its range is not all of the reals. So $f$ is not surjective. This is a kind of argument in the mathematical subject of **analysis**, hypothetically do-able for a calculus student who has learned all the definitions, but certainly expected of a student in analysis. Now to determine the actual range of $f$, we invoke calculus and analysis once more: Since $f'(x) = \frac{1 - 2 \ln(x)}{x^{3}}$, which has critical point at $x = \sqrt{e}$, and that by first derivative test, we have $f'(x) > 0$ when $x \in (0,\sqrt{e})$ and $f'(x) < 0$ when $x\in (\sqrt{e},\infty)$, we conclude that $x=\sqrt{e}$ is a global maximum, with value $f(x) = \frac{1}{2e}$. So $\text{range}(f)=(-\infty, \frac{1}{2e}]$. $\blacklozenge$ **Ok what is the bottom-line here:** The point is: **In general the range of a function is not easily determined, nor whether it is surjective or not!** It could require sophisticated tools and careful analysis! For instance, what about functions whose rule is $f(x) = \frac{x^{3}-2x^{2}-20}{5+2\cos(3x+1)}$? Or $f(x,y) = \frac{\sin(x+y)}{1+x^{2}+y2}$? Later we will see that the type of functions we are gonna be concerned with in linear algebra, so-called **linear transformations**, this question of surjectivity is not too bad, within reason. ## Injectivity. The cousin of surjectivity is **injectivity**. We make the following definition: > Definition. We say a function $f:A \to B$ is **injective** if whenever $x\neq y$ in $A$, we have $f(x)\neq f(y)$. In this case, we also say $f$ is **one-to-one**, and we use the noun $f$ is an **injection**. A logical equivalent way of saying $f$ is injective is: whenever $f(x)=f(y)$, then we must have $x=y$. In calculus-land, a function is injective if it passes the **horizontal line test**, which embodies what it means here -- every output arise from at most one input. Diagrammatically we have![[1 teaching/smc-spring-2024-math-13/linear-algebra-notes/---files/23-on-functions 2024-03-22 10.19.51.excalidraw.svg]] %%[[1 teaching/smc-spring-2024-math-13/linear-algebra-notes/---files/23-on-functions 2024-03-22 10.19.51.excalidraw|🖋 Edit in Excalidraw]]%% **Example.** Is $f(x)=x^{2}:\mathbb{R}\to\mathbb{R}$ injective? (Make note of this super short hand of writing a function.) Well, no, $f$ is not injective, since $f(1)=1=f(-1)$. Again, it is not always easy to determine injectivity. In calculus we do have some tools, for instance we have > Let $f:(a,b)\to \mathbb{R}$ be any function. > If $f$ is **strictly increasing**, namely $x < y$ implies $f(x) < f(y)$, for all $x,y\in (a,b)$, then $f$ is injective. > If $f$ is **strictly decreasing**, namely $x < y$ implies $f(x) > f(y)$, for all $x,y\in (a,b)$, then $f$ is injective. If further $f$ is differentiable on $(a,b)$, we also have > If $f' > 0$ on $(a,b)$, then $f$ is strictly increasing on $(a,b)$. > If $f' < 0$ on $(a,b)$, then $f$ is strictly decreasing on $(a,b)$. So together we have a calculus way of determining injectivity, provided $f$ is differentiable. ## Identity function. For any set $A$ we can always define a special function called **the identify function on $A$**, denoted as $\text{id}_{A}:A\to A$, or $\mathbb{1}_{A}:A\to A$, or $\mathbf{1}_{A}:A\to A$, given by the rule $$ \text{id}_{A}(x)=x, \quad\text{for all $x\in A$.} $$That is, the identity function does nothing to its input and returns the same input as output. It is the "touch nothing function" or "do nothing function". Convince yourself that $\text{id}_{A}$ is both injective and surjective. ## Composition of functions, operators, and iterations. Given two functions $f:A\to B$ and $g:B \to C$, then it makes sense to speak of their **composition** $g\circ f$, which is a function $g\circ f: A\to C$, given by $(g\circ f)(x):=g(f(x))$. By the way, we will also use diagrams to denote this composition like this: $$ A\xrightarrow{f}B\xrightarrow{g} C.$$ Notice how we need the codomain of $f$ match the domain of $g$ for the composition $g\circ f$ to make sense. Also pay attention to the ordering -- the composition $f\circ g$ is not the same as $g\circ f$, nor is it necessarily well-defined! (Does this reminds you of matrices $PQ\neq QP$ in general?) Composition with the identity function gives the same function back, provided the domain or codomain matches. That is, if we have $f:A\to B$, then $f\circ\text{id}_{A}=f$ and $\text{id}_{B}\circ f =f$. Think of these diagrammatically, $A\xrightarrow{\text{id}_{A}}A\xrightarrow{f}B$ is the same as $A\xrightarrow{f}B$; and that $A\xrightarrow{f}B\xrightarrow{\text{id}_{B}}B$ is the same as $A\xrightarrow{f}B$. Though function composition is not necessarily commutative, it is however **associative**. That is, we have > Suppose we have functions $f:A\to B$, $g:B\to C$, $h:C\to D$. Then $h\circ (g\circ f)=(h\circ g)\circ f$. So it makes sense to write the ambiguous statement $h\circ g\circ f$, and draw the diagram $$ A\xrightarrow{f}B\xrightarrow{g}C\xrightarrow{h}D. $$ $\blacktriangleright$ Proof. This has a simple element-wise proof. Take any $a\in A$. Note that $(h\circ(g\circ f))(a)=h((g\circ f)(a))=h(g(f(a)))$, and that $((h\circ g)\circ f)(a)=(h\circ g)(f(a))=h(g(f(a)))$, which is the same. $\blacksquare$ Sometimes a function $f$ has the same domain as its codomain, where $f:A\to A$. In this case, it makes sense to speak of $f\circ f$, the composition of $f$ with itself. And in this case we say $f$ is an **operator**. So an operator is a function that can self-compose. And we can self-compose repeatedly so. Repeated self-composition is also called **iteration**, and we denote $$ f^{n}=\underbrace{f\circ f\circ \cdots \circ f}_{\text{$n$ times}}, $$for **the $n$-th power of $f$** This is yet-another "unfortunate" notation, as it gets confused with another familiar notation. See example: **Example.** Let $f:\mathbb{R} \to \mathbb{R}$ be given by $f(x)=x+1$. Then the function $f^{2}:\mathbb{R}\to \mathbb{R}$ is $f^{2}(x)=f(f(x))=x+2$ . This is **not to be confused with the function** $g$ given by $g(x)=f(x)^{2}$, which takes the value of $f(x)$ and squared. So $g(x)=x^{2}+2x+1$. Got it?! If $f:A\to A$ is an operator, then we define $f^{0}=\text{id}_{A}$, where $\text{id}_{A}(x)=x$ for each $x\in A$, which is the identity function on $A$. If $f$ is invertible (described next), then $f^{-1}$ denotes the inverse of $f$, and that we define $$ f^{-n}=(f^{-1})^{n}=\underbrace{f^{-1}\circ f^{-1} \circ \cdots \circ f^{-1}}_{\text{$n$ times}}. $$ ## Bijectivity and invertibility. If a function $f:A \to B$ is **both surjective and injective**, we say $f$ is **bijective**. Something special happens when a function $f$ is bijective, it is in fact **invertible**. What does it mean for a function to be invertible? Here is its definition > **Definition**. A function $f:A \to B$ is said to be **invertible** if there exists another function, commonly denoted as $f^{-1}:B\to A$, such that the compositions $$ (f^{-1}\circ f)(x)=x, \quad \text{for all $x \in A$}, $$and $$ (f\circ f^{-1})(x) = x,\quad\text{for all $x \in B$}. $$ This $f^{-1}$ is said to be an **inverse function** of $f$. In fact, this inverse function is unique, so it makes sense to speak of **the** inverse function of $f$, when $f$ is invertible. [[smc-spring-2024-math-13/linear-algebra-notes/23b-uniqueness-of-inverse-function|See notes 23b]]. The inverse function of $f$, if exists, is a function whose job is to undo the action of $f$, and vice versa -- that $f$ undoes the action of $f^{-1}$. Diagrammatically we have ![[1 teaching/smc-spring-2024-math-13/linear-algebra-notes/---files/23-on-functions 2024-03-22 12.10.02.excalidraw.svg]] %%[[1 teaching/smc-spring-2024-math-13/linear-algebra-notes/---files/23-on-functions 2024-03-22 12.10.02.excalidraw|🖋 Edit in Excalidraw]]%% It is an important characterization that > **Important characterization.** A function $f:A \to B$ is bijective if and only if $f$ is invertible. The motivated reader can try to prove above characterization or [[1 teaching/smc-spring-2024-math-13/linear-algebra-notes/23a-bijective-iff-invertible|read its proof here in notes 23a]], but we can take this set-theoretical result as **fact**. This proposition is important and we will use it often. Note, the function that takes in each $x \in A$, and just outputs $x$ back is called the **identity function** on $A$, and we write $\text{id}_{A}$. Similarly, the function that takes in each $x\in B$ and just outputs $x$ back is called the identity function on $B$, and we write $\text{id}_{B}$. So, if $f:A\to B$ has inverse function $f^{-1}:B\to A$, then we have the compositions $$ f^{-1}\circ f=\text{id}_{A} \quad\text{and}\quad f\circ f^{-1} = \text{id}_{B}. $$ Whew! Ok. Take a moment to digest all of these. **Example.** The function $f(x)=x^{2}:\mathbb{R} \to \mathbb{R}$ is neither injective nor surjective. This $f$ is not bijective, and $f$ has no inverse function, and $f$ is not invertible. **Example.** The function $f(x)=x^{2}:[0,\infty) \to \mathbb{R}$ is injective but not surjective. This $f$ is not bijective, and $f$ has no inverse function, and $f$ is not invertible. **Example.** The function $f(x)=x^{2}:[0,\infty) \to [0,\infty)$ is injective and surjective. This $f$ is bijective, and $f$ is invertible with an inverse function $f^{-1}$. This function $f^{-1}$ is given by $f^{-1}(x)=\sqrt{x}:[0,\infty) \to [0,\infty)$. Note the compositions $$ (f^{-1}\circ f)(x) = x \quad \text{for each } x\in [0,\infty) $$and $$ (f\circ f^{-1})(x) = x \quad \text{for each } x\in [0,\infty). $$ **Example.** The function $f(x)=x^{2}:(-\infty,0] \to [0,\infty)$ is also injective and surjective. This $f$ is bijective, and $f$ is invertible with an inverse function $f^{-1}$. This function $f^{-1}$ is given by $f^{-1}(x)=-\sqrt{x}:[0,\infty)\to(-\infty,0]$. Note the compositions $$ (f^{-1}\circ f)(x) = x \quad \text{for each } x\in (-\infty,0] $$and $$ (f\circ f^{-1})(x) = x \quad \text{for each } x\in [0,\infty). $$ Ok. I think that's enough for now before this gets too long. (I will add more relevant details if needed later.) ## Important final remark. In general, the questions of range, surjectivity, injectivity, bijectivity, and invertibility, etc of a general function is **not easy to determine**. However, in linear algebra, we consider a special kinds of functions between linear spaces called **linear transformations.** And as it turns out, it is relatively simple to determine for a linear transformation whether it is surjective, injective, invertible, or its range, especially when the dimension of the linear space is finite.